This is a submission for the assignment "Visualization in R using ggplot2".
The content covers:
rows <- nrow(df)
cols <- ncol(df)
#datatable(df)
The marathon data set has 22 variables (columns) and 16433 measurements (rows).
Most of the participants are young or middle-aged, and the age pattern is similar for both genders.
The diagram below shows that men finished the race slightly faster than women, but the overall shape of the distribution is the same for both genders. This suggests that both genders perform comparably, even though the number of female participants (around 5607) is much smaller than the number of male participants (around 10826).
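A comparison of this kind could be drawn roughly as follows. This is only a sketch: the data frame `toy`, its columns `gender` and `finish_hours`, and the simulated values are assumptions made for illustration and are not taken from the marathon data set.

```r
library(ggplot2)

# Hypothetical toy data; the real data set has 16433 rows.
set.seed(1)
toy <- data.frame(
  gender = rep(c("F", "M"), times = c(40, 60)),
  finish_hours = c(rnorm(40, mean = 4.6, sd = 0.7),
                   rnorm(60, mean = 4.3, sd = 0.7))
)

# Number of participants per gender.
gender_counts <- table(toy$gender)

# Overlaid densities show both the shift and the shape of each distribution.
p <- ggplot(toy, aes(x = finish_hours, fill = gender)) +
  geom_density(alpha = 0.4) +
  labs(x = "Finish time (hours)", y = "Density", fill = "Gender")
print(p)
```

Densities are used here instead of raw counts so that the smaller group is not visually dwarfed by the larger one.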
Women have slightly more disqualifications than men at older ages.
The plot shows that gun time is not a reliable measure of finish time and is more of a ceremonial one, as it deviates considerably from the actual chip time.
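The size of that deviation can be checked directly by subtracting one time from the other. The sketch below assumes hypothetical columns `gun_time` and `chip_time` measured in seconds; the values are made up for illustration.

```r
# Hypothetical column names and values (in seconds).
times <- data.frame(
  gun_time  = c(9000, 12600, 15300, 18900),
  chip_time = c(8990, 12420, 14880, 18300)
)

# Delay between the starting gun and the runner crossing the start line.
times$start_delay <- times$gun_time - times$chip_time

mean_delay <- mean(times$start_delay)  # 302.5 seconds for this toy data
max_delay  <- max(times$start_delay)   # 600 seconds for this toy data
```

In this toy example the delay grows for slower finishers, which is the kind of pattern that would make gun time misleading for anyone who did not start at the front.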
Based on finish time, participants can be put into three types: runners if the finish time was below 3 hours, joggers if it was between 3 and 5 hours, and walkers if it was more than 5 hours.
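One possible way to encode these thresholds is sketched below. The function name `classify_finisher` and the `finish_hours` vector are hypothetical, and treating exactly 3 and exactly 5 hours as "jogger" is an assumption about how the boundaries are meant to be read.

```r
# Hypothetical helper: map finish time in hours to a participant type.
classify_finisher <- function(finish_hours) {
  type <- ifelse(finish_hours < 3, "runner",
          ifelse(finish_hours <= 5, "jogger", "walker"))
  # An ordered set of levels keeps plots and tables in a sensible order.
  factor(type, levels = c("runner", "jogger", "walker"))
}

finish_hours <- c(2.4, 3.0, 4.5, 5.0, 6.2)
types <- classify_finisher(finish_hours)
table(types)
```

The resulting factor can then be used as a fill or facet variable in a plot.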
The figure below shows how strongly variables such as the time from the start to the first 10 km, the time from 10 km to halfway, and the time from halfway to the finish are correlated with gender and Category. It can be concluded that participants who are fast in the early stages of the race are more likely to achieve a better overall position, which is expected.
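The link between stage times and overall position can be illustrated with a rank correlation. This is a minimal sketch on made-up numbers: the data frame `splits` and its column names are assumptions, and Spearman correlation is chosen here only because position is a rank, not because the original analysis is known to have used it.

```r
# Hypothetical stage times (minutes) and overall positions for six participants.
splits <- data.frame(
  start_to_10k     = c(35, 40, 45, 50, 55, 60),
  tenk_to_half     = c(40, 44, 52, 55, 63, 70),
  half_to_finish   = c(80, 95, 90, 120, 125, 150),
  overall_position = c(1, 2, 3, 4, 5, 6)
)

stage_cols <- c("start_to_10k", "tenk_to_half", "half_to_finish")

# Spearman correlation of each stage time with the overall position.
cors <- sapply(splits[stage_cols], function(x) {
  cor(x, splits$overall_position, method = "spearman")
})
round(cors, 3)
```

A value close to 1 means that slower stage times go with worse (higher) positions.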
The figure below shows the positions of the top 10 finishers at different stages of the marathon.
David (2nd position) actually performed better throughout the race except in the last stage. Because his performance is more consistent, he has a good chance of winning a future marathon.